Abstract: This paper presents a multiclass, multilabel implementation
of Least Squares Support Vector Machines (LS-SVM) for direction of
arrival (DOA) estimation in a CDMA system. For any estimation or
classification system the algorithm's capabilities and performance
must be evaluated. This paper includes a vast ensemble of data
supporting the machine learning based DOA estimation algorithm.
Accurate performance characterization of the algorithm is required to
justify the results and prove that multiclass machine learning methods
can be successfully applied to wireless communication problems. The
learning algorithm presented in this paper includes steps for
generating statistics on the multiclass evaluation path. The error
statistics provide a confidence level of the classification accuracy.


I.  INTRODUCTION 

Machine learning research has largely been devoted to binary and
multiclass problems relating to data mining, text categorization, and
pattern recognition. Recently, machine learning techniques have been
applied to various problems relating to cellular communications,
notably spread spectrum receiver design, channel equalization, and
adaptive beamforming with direction of arrival (DOA) estimation. In
our research we present a machine learning based approach for DOA
estimation in a CDMA communication system [1]. The DOA estimates are
used in adaptive beamforming for interference suppression, a critical
component in cellular systems. Interference suppression reduces the
multiple access interference (MAI), which lowers the required transmit
power. The interference suppression capability directly influences the
cellular system capacity, i.e., the number of active mobile
subscribers per cell.

Beamforming, tracking, and DOA estimation are current research topics
with various technical approaches. Least mean square estimation,
Kalman filtering, and neural networks [2], [3], [4] have been
successfully applied to these problems. Many approaches have been
developed for calculating the DOA; three techniques based on signal
subspace decomposition are ESPRIT, MUSIC, and Root-MUSIC [1].

Neural networks have been successfully applied to the problem of DOA
estimation and adaptive beamforming in [4], [5], [6]. New machine
learning techniques, such as support vector machines (SVM) and
boosting [7], perform exceptionally well in multiclass problems, and
new optimization techniques are published regularly. These new
machine learning techniques have the potential to exceed the
performance of the neural network algorithms in communication
applications.

The machine learning methods presented in this paper include
subspace based estimation applied to the sample covariance matrix of
the received signal. The one-vs-one multiclass LS-SVM algorithm uses
both training data and received data to generate the DOA estimates.
The end result is an efficient approach for estimating the DOAs in a
CDMA cellular architecture [1].

This paper is organized as follows. Section II presents the system
models for an adaptive antenna array CDMA system. A review of binary
and multiclass machine learning methods is presented in Section III,
along with background information on the LS-SVM algorithm. Section IV
includes a brief review of classic DOA estimation algorithms and the
elements of a machine learning based DOA estimation algorithm.
Section V presents a one-vs-one multiclass LS-SVM algorithm for DOA
estimation, and simulation results are presented in Section VI.
Section VII includes a comparison between standard DOA estimation
algorithms and our machine learning based algorithm.

II.  System  Models 

This section includes an overview of system models for the received
signal and adaptive antenna array designs. All notation is described
below and is used consistently throughout the paper.

A.  Received Signal at Antenna Array Output

The baseband signal, $r_A(t)$, from the antenna array is

$$r_A(t) = A\,s(t) + n_r(t), \tag{1}$$

$$A = [\, a(\theta_1) \;\; a(\theta_2) \;\; \cdots \;\; a(\theta_L) \,], \tag{2}$$

$$a(\theta_l) = [\, 1 \;\; e^{-jk_l} \;\; e^{-j2k_l} \;\; \cdots \;\; e^{-j(D-1)k_l} \,]^T, \tag{3}$$

$$s(t) = [\, s_1(t) \;\; s_2(t) \;\; \cdots \;\; s_L(t) \,]^T, \tag{4}$$

$$s_l(t) = \sqrt{p_{t_i}(t)}\; q_l\, b_i(t), \quad \text{for path } l, \tag{5}$$

where $r_A(t)$ is the received signal of mobile $i$, $A$ is the
$D \times L$ array steering matrix for $D$ antenna elements and $L$
transmission paths, $s(t)$ is the $L \times 1$ received baseband
signal at the output of the matched filter, $a(\theta_l)$ is the
$D \times 1$ steering vector of (3), $k_l = \frac{\omega_c d}{c} \sin\theta_l$,
$d$ is the spacing between antenna elements, $\omega_c$ is the carrier
frequency, $c$ is the velocity of propagation, $\theta_l$ is the
direction of arrival of the $l$th signal, $p_{t_i}(t)$ is the transmit
signal power from mobile $i$, $q_l$ is the attenuation due to
shadowing from path $l$, $b_i(t)$ is the data stream of mobile $i$,
and $n_r(t)$ is the additive noise vector.

To ease the complexity of the notation, the terms relative to the
multiple paths are combined as

$$z_i = \sum_{l=1}^{L} a(\theta_l)\, q_l. \tag{6}$$

In [8] $z_i$ is defined as the spatial signature of the antenna array
to the $i$th mobile.
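The signal model in (1) to (4) can be sketched numerically as follows. This is a minimal illustration, not the authors' simulation code: it assumes a uniform linear array with half-wavelength element spacing (so that $k_l = \pi \sin\theta_l$), unit-variance complex Gaussian path signals, and white complex Gaussian noise. The function names and default values are hypothetical.

```python
import numpy as np


def steering_vector(theta_deg, num_elements, spacing_wavelengths=0.5):
    """D-element steering vector a(theta) of (3) for a uniform linear array.

    Assumes k = (omega_c * d / c) * sin(theta) = 2*pi*(d/lambda)*sin(theta).
    """
    k = 2.0 * np.pi * spacing_wavelengths * np.sin(np.deg2rad(theta_deg))
    return np.exp(-1j * k * np.arange(num_elements))


def steering_matrix(thetas_deg, num_elements, spacing_wavelengths=0.5):
    """D x L matrix A = [a(theta_1) ... a(theta_L)] of (2)."""
    return np.column_stack(
        [steering_vector(t, num_elements, spacing_wavelengths) for t in thetas_deg]
    )


def received_signal(thetas_deg, num_elements, num_samples, noise_var=0.2, seed=0):
    """D x N samples of r_A = A s + n_r, as in (1), with assumed Gaussian s and n_r."""
    rng = np.random.default_rng(seed)
    A = steering_matrix(thetas_deg, num_elements)
    num_paths = A.shape[1]
    # Unit-variance complex path signals (assumption for illustration).
    s = (rng.standard_normal((num_paths, num_samples))
         + 1j * rng.standard_normal((num_paths, num_samples))) / np.sqrt(2.0)
    # Complex white noise with total variance noise_var per element.
    n = np.sqrt(noise_var / 2.0) * (
        rng.standard_normal((num_elements, num_samples))
        + 1j * rng.standard_normal((num_elements, num_samples)))
    return A @ s + n
```

For example, `received_signal([10.0, 25.0], 8, 100)` returns an 8 x 100 complex array for two paths arriving at 10 and 25 degrees on an eight-element array.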

III.  Support  Vector  Machines  - Background 

A major machine learning application, pattern classification,
observes input data and applies classification rules to generate
binary or multiclass labels. In the binary case, a classification
function is estimated using input/output training pairs,
$(x_i, y_i)$, $i = 1, \ldots, n$, with unknown probability
distribution, $P(x, y)$,

$$(x_1, y_1), \ldots, (x_n, y_n) \in \mathbb{R}^N \times Y, \tag{7}$$

$$y_i \in \{-1, +1\}. \tag{8}$$

The estimated classification function maps the input to a binary
output, $f : \mathbb{R}^N \to \{-1, +1\}$. The system is first trained
with the given input/output data pairs; then the test data, taken from
the same probability distribution $P(x, y)$, is applied to the
classification function. For the multiclass case $Y \in \mathbb{R}^G$,
where $Y$ is a finite set of real numbers and $G$ is the size of the
multiclass label set. In multiclass classification the objective is to
estimate the function that maps the input data to a finite set of
output labels, $f : \mathbb{R}^N \to S(\mathbb{R}^N) \in \mathbb{R}^G$.

Support Vector Machines (SVMs) were originally designed for the
binary classification problem. Like other machine learning
classifiers, SVMs find a classification function that separates the
data classes; the SVM does so with the hyperplane that has the largest
margin. The data points near the optimal hyperplane are the "support
vectors". SVMs are a nonparametric machine learning algorithm with the
capability of controlling the capacity through the support vectors.

A.  Kernel  Functions 

The kernel based SVM maps the input space into a higher dimensional
feature space, $F$, via a nonlinear mapping

$$\Gamma : \mathbb{R}^N \to F, \tag{9}$$

$$x \mapsto \Gamma(x). \tag{10}$$

The data does not have the same dimensionality as the feature space
since the mapping process is to a non-unique generalized surface [9].
The dimension of the feature space is not as important as the
complexity of the classification functions. For example, in the input
space, separating the input/output pairs may require a nonlinear
separating function, but in a higher dimension feature space the
input/output pairs may be separated with a linear hyperplane. The
nonlinear mapping function $\Gamma(x_i)$ is related to the kernel,
$k(x, x_i)$, by

$$\Gamma(x)^T \Gamma(x_i) = k(x, x_i). \tag{11}$$

Four popular kernel functions are the linear kernel, polynomial
kernel, radial basis function (RBF), and multilayer perceptron (MLP).


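$$\text{linear:} \quad k(x, x_i) = x \cdot x_i \tag{12}$$

$$\text{polynomial:} \quad k(x, x_i) = \left( (x \cdot x_i) + \theta \right)^d \tag{13}$$

$$\text{RBF:} \quad k(x, x_i) = \exp\!\left( -\|x - x_i\|^2 / \sigma^2 \right) \tag{14}$$

$$\text{MLP:} \quad k(x, x_i) = \tanh\!\left( \kappa\,(x \cdot x_i) + \theta \right) \tag{15}$$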

The  performance  of  each  kernel  function  varies  with  the 
characteristics  of  the  input  data.  Refer  to  [10]  for  more 
information  on  feature  spaces  and  kernel  methods. 
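As a concrete illustration of the four kernels, the sketch below evaluates each one for a pair of real-valued vectors. The parameter names (`theta`, `d`, `sigma`, `kappa`) and their default values are assumptions chosen for the example, and the RBF width convention (division by $\sigma^2$ rather than $2\sigma^2$) is one common choice rather than something fixed by the text.

```python
import numpy as np


def linear_kernel(x, xi):
    """Linear kernel: inner product of the two inputs."""
    return float(np.dot(x, xi))


def polynomial_kernel(x, xi, theta=1.0, d=2):
    """Polynomial kernel: (x . xi + theta) ** d."""
    return float((np.dot(x, xi) + theta) ** d)


def rbf_kernel(x, xi, sigma=1.0):
    """RBF kernel: exp(-||x - xi||^2 / sigma^2) (assumed width convention)."""
    diff = np.asarray(x, dtype=float) - np.asarray(xi, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / sigma ** 2))


def mlp_kernel(x, xi, kappa=1.0, theta=0.0):
    """MLP (sigmoid) kernel: tanh(kappa * x . xi + theta)."""
    return float(np.tanh(kappa * np.dot(x, xi) + theta))
```

Any of these functions can be evaluated over all pairs of training vectors to build the kernel (Gram) matrix used by the classifiers described in the following subsections.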

B.  Binary  Classification 

In binary classification systems the machine learning algorithm
generates the output labels with a hyperplane separation, where
$y_i \in \{-1, +1\}$ represents the classification "label" of the
input vector $x_i$. The input sequence and a set of training labels
are represented as $\{x_i, y_i\}_{i=1}^{n}$, $y_i \in \{-1, +1\}$. If
the two classes are linearly separable in the input space then the
hyperplane is defined as $w^T x + b = 0$, where $w$ is a weight vector
perpendicular to the separating hyperplane and $b$ is a bias that
shifts the hyperplane parallel to itself. If the input space is
projected into a higher dimensional feature space then the hyperplane
becomes $w^T \Gamma(x) + b = 0$.

The SVM algorithm is based on the hyperplane definition [11],

$$y_i \left[ w^T \Gamma(x_i) + b \right] \ge 1, \quad i = 1, \ldots, N. \tag{16}$$

Given the training sets in (7), the binary support vector machine
classifier is defined as

$$y(x) = \operatorname{sign}\!\left[ \sum_{i=1}^{N} \alpha_i y_i\, k(x, x_i) + b \right]. \tag{17}$$

The non-zero $\alpha_i$'s are "support values" and the corresponding
data points, $x_i$, are the "support vectors". Quadratic programming
is one method of solving for the $\alpha_i$'s and $b$ in the standard
SVM algorithm.


C.  Multiclass  Classification 

For the multiclass problem the machine learning algorithm produces
estimates with multiple hyperplane separations. The set of input
vectors and training labels is defined as $\{x_n, y_n\}_{n=1}^{N}$,
$x_n \in \mathbb{R}^N$, $y_n \in \{1, \ldots, C\}$, where $n$ is the
index of the training pattern and $C$ is the number of classes. There
exist many SVM approaches to the multiclass classification problem.
Two primary multiclass techniques are one-vs-one and one-vs-rest.
One-vs-one applies SVMs to selected pairs of classes. For $C$ distinct
classes there are $\frac{C(C-1)}{2}$ hyperplanes that separate the
classes. The one-vs-rest SVM technique generates $C$ hyperplanes that
separate each distinct class from the ensemble of the rest. In this
paper we only consider the one-vs-one multiclass SVM.

Platt et al. [12] introduced the decision directed acyclic graph
(DDAG) and a Vapnik-Chervonenkis (VC) analysis of the margins. The
DDAG technique is based on $\frac{C(C-1)}{2}$ classifiers for a $C$
class problem, one node for each pair of classes. In [12] it is proved
that maximizing the margins at each node of the DDAG will minimize the
generalization error. The performance benefit of the DDAG architecture
is realized when the $i$th classifier is selected at the $i$th/$j$th
node and the $j$th class is eliminated. Refer to Figure 1 for a
diagram of a four-class DDAG.


Fig.  1.  Four-class DDAG for one-vs-one multiclass LS-SVM based DOA
estimation.
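The elimination process on a DDAG can be sketched as below. This is a generic illustration of the list-based DDAG evaluation described in [12], under the assumption that each node compares the first and last classes still in contention. The `decide` callback is a hypothetical stand-in for the trained pairwise classifier at the $i$/$j$ node and must return either `i` or `j`.

```python
def ddag_classify(x, classes, decide):
    """Evaluate a DDAG over an ordered list of class labels.

    decide(i, j, x) is a user-supplied pairwise classifier that returns
    the winning label (i or j). Each node removes the losing class, so
    a C-class problem visits C - 1 of the C(C-1)/2 nodes.
    """
    remaining = list(classes)
    path = []
    while len(remaining) > 1:
        i, j = remaining[0], remaining[-1]
        path.append((i, j))
        if decide(i, j, x) == i:
            remaining.pop()      # class j is eliminated
        else:
            remaining.pop(0)     # class i is eliminated
    return remaining[0], path
```

With four classes the first node is 1 vs 4, and every input follows an evaluation path of three nodes before a label is declared.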


D.  Least  Squares  SVM 

Suykens et al. [13] introduced the LS-SVM, which is based on the SVM
classifier in equation (17). The LS-SVM classifier is generated from
the optimization problem

$$\min_{w,\,b,\,\phi} \; \mathcal{L}_{LS}(w, b, \phi) = \frac{1}{2}\|w\|^2 + \frac{1}{2}\gamma \sum_{i=1}^{n} \phi_i^2, \tag{18}$$

where $\gamma$ and $\phi_i$ are the regularization and error
variables, respectively. The minimization in (18) is subject to the
equality constraints

$$y_i \left[ w^T \Gamma(x_i) + b \right] = 1 - \phi_i, \quad i = 1, \ldots, n. \tag{19}$$

The LS-SVM includes one universal parameter, $\gamma$, that regulates
the complexity of the machine learning model. This parameter is
applied to the data in the feature space, the output of the kernel
function. A small value of $\gamma$ minimizes the model complexity,
while a large value of $\gamma$ promotes exact fitting to the training
points. The error variable $\phi_i$ allows misclassifications for
overlapping distributions [14]. The Lagrangian of equation (18) is
defined as

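$$\mathcal{Z}_{LS}(w, b, \phi, \alpha) = \mathcal{L}_{LS}(w, b, \phi) - \sum_{i=1}^{n} \alpha_i \left\{ y_i \left[ w^T \Gamma(x_i) + b \right] - 1 + \phi_i \right\}, \tag{20}$$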

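where $\alpha_i$ are Lagrangian multipliers that can be either
positive or negative. The conditions of optimality are

$$\frac{\partial \mathcal{Z}_{LS}}{\partial w} = 0, \quad w = \sum_{i=1}^{n} \alpha_i y_i \Gamma(x_i), \tag{21}$$

$$\frac{\partial \mathcal{Z}_{LS}}{\partial b} = 0, \quad \sum_{i=1}^{n} \alpha_i y_i = 0, \tag{22}$$

$$\frac{\partial \mathcal{Z}_{LS}}{\partial \phi_i} = 0, \quad \alpha_i = \gamma \phi_i, \tag{23}$$

$$\frac{\partial \mathcal{Z}_{LS}}{\partial \alpha_i} = 0, \quad y_i \left[ w^T \Gamma(x_i) + b \right] - 1 + \phi_i = 0. \tag{24}$$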

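A linear system can be constructed from equations (21) to (24) [13],

$$\begin{bmatrix} I & 0 & 0 & -Z^T \\ 0 & 0 & 0 & -Y^T \\ 0 & 0 & \gamma I & -I \\ Z & Y & I & 0 \end{bmatrix} \begin{bmatrix} w \\ b \\ \phi \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ \vec{1} \end{bmatrix}, \tag{25}$$

$$Z = \left[ \Gamma(x_1)^T y_1; \ldots; \Gamma(x_n)^T y_n \right], \tag{26}$$

$$Y = \left[ y_1; \ldots; y_n \right], \quad \vec{1} = \left[ 1; \ldots; 1 \right], \tag{27}$$

$$\phi = \left[ \phi_1; \ldots; \phi_n \right], \quad \alpha = \left[ \alpha_1; \ldots; \alpha_n \right]. \tag{28}$$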


By eliminating the weight vector $w$ and the error variable $\phi$,
the linear system is reduced to

$$\begin{bmatrix} 0 & Y^T \\ Y & ZZ^T + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ \vec{1} \end{bmatrix}. \tag{29}$$

In the linear systems defined in (25) to (29) the support values
$\alpha_i$ are proportional to the errors at the data points. In the
standard SVM case many of these support values are zero, but most of
the least squares support values are non-zero. In [13] a conjugate
gradient method is proposed for finding $b$ and $\alpha$, which are
required for the SVM classifier in equation (17).
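A minimal sketch of binary LS-SVM training follows, assuming real-valued feature vectors and an RBF kernel, so that the entries of $ZZ^T$ are $y_i y_j k(x_i, x_j)$. For brevity it solves the reduced system (29) with a direct dense solver instead of the conjugate gradient method of [13]. The function names and the default values of `gamma` and `sigma` are illustrative assumptions.

```python
import numpy as np


def rbf_gram(X1, X2, sigma):
    """Matrix of exp(-||x - x'||^2 / sigma^2) for all row pairs (assumed kernel)."""
    sq = (np.sum(X1 ** 2, axis=1)[:, None]
          + np.sum(X2 ** 2, axis=1)[None, :]
          - 2.0 * X1 @ X2.T)
    return np.exp(-np.maximum(sq, 0.0) / sigma ** 2)


def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    """Solve the reduced system (29) for the support values alpha and bias b."""
    n = X.shape[0]
    y = np.asarray(y, dtype=float)
    omega = np.outer(y, y) * rbf_gram(X, X, sigma)   # Z Z^T via the kernel
    system = np.zeros((n + 1, n + 1))
    system[0, 1:] = y                                # first row: [0, Y^T]
    system[1:, 0] = y                                # first column: [0; Y]
    system[1:, 1:] = omega + np.eye(n) / gamma       # Z Z^T + I / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    solution = np.linalg.solve(system, rhs)
    return solution[1:], solution[0]                 # alpha, b


def lssvm_predict(X_train, y_train, alpha, b, X_new, sigma=1.0):
    """Evaluate the classifier (17) on the rows of X_new."""
    K = rbf_gram(X_new, X_train, sigma)
    return np.sign(K @ (alpha * np.asarray(y_train, dtype=float)) + b)
```

In a one-vs-one scheme, one such (`alpha`, `b`) pair would be trained and stored for every pair of classes.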

IV.  Algorithms  for  DOA  Estimation 

Two primary, classic methods for subspace based DOA estimation exist
in the literature: Multiple Signal Classification (MUSIC) [15] and
Estimation of Signal Parameters Via Rotational Invariance Techniques
(ESPRIT) [16]. The MUSIC algorithm is based on the noise subspace and
ESPRIT is based on the signal subspace.

Many computational techniques exist for working around the
limitations of DOA estimation techniques, but currently no techniques
exist for a system level approach to accurately estimating the DOAs at
the base station. A number of limitations relating to popular DOA
estimation techniques are:

1)  The signal subspace dimension is not known, although many papers
assume that it is. The differences between the covariance matrix and
the sample covariance matrix add to the uncertainty.

2)  All possible angles must be searched to determine the maximum
response of the MUSIC algorithm.

3)  The Root-MUSIC polynomial must be evaluated on the unit circle.

4)  ESPRIT requires multiple eigen decompositions.

5)  The maximum likelihood method is computationally complex.

The capabilities, in terms of resolution and computational
requirements, of these standard DOA estimation algorithms serve as the
benchmark for the machine learning based DOA estimation. Refer to
Section VII for a comparison between standard DOA estimation
algorithms and the one-vs-one multiclass LS-SVM DOA estimation
algorithm.

A.  Machine  Learning  for  DOA  Estimation 

To estimate the antenna array response,
$z_i = \sum_{l=1}^{L} a(\theta_l)\, q_i^l$, we must know $a(\theta_l)$
and $q_i^l$. The continuous pilot signal, included in cdma2000, can be
used in estimating $q_i^l$. This must be done for each resolvable
path, i.e., $q_i = [\, q_i^1, q_i^2, \ldots, q_i^L \,]$. Estimating
$A(\theta) = [\, a(\theta_1), a(\theta_2), \ldots, a(\theta_L) \,]$
requires information on the DOA.

The process of DOA estimation is to monitor the outputs of $D$
antenna elements and predict the angle of arrival of $L$ signals,
$L < D$. The output matrix from the antenna elements is

$$A = [\, a(\theta_1) \;\; a(\theta_2) \;\; \cdots \;\; a(\theta_L) \,], \tag{30}$$

$$a(\theta_l) = [\, 1 \;\; e^{-jk_l} \;\; e^{-j2k_l} \;\; \cdots \;\; e^{-j(D-1)k_l} \,]^T,$$

and the vector of incident signals is
$\Theta = [\, \theta_1, \theta_2, \ldots, \theta_L \,]$. With a
training process, the learning algorithms generate DOA estimates,
$\hat{\Theta} = [\, \hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_L \,]$,
based on the responses from the antenna elements, $a(\theta_l)$.

For the proposed machine learning technique there is a trade-off
between the accuracy of the DOA estimation and the antenna array
beamwidth. An increase in DOA estimation accuracy translates into a
smaller beamwidth and a reduction in MAI. Therefore the accuracy in
DOA estimation directly influences the minimum required power
transmitted by the mobile. There should be a balance between computing
effort and reduction in MAI.

V.  LS-SVM  DDAG  based  DOA  Estimation 
Algorithm 

In this paper we propose a multiclass SVM algorithm trained with
projection vectors generated from the signal subspace eigenvectors and
the sample covariance matrix. The output labels from the SVM system
are the DOA estimates.

The one-vs-one multiclass LS-SVM DDAG technique for DOA estimation is
trained for $C$ DOA classes. The DDAG tree is initialized with
$\frac{C(C-1)}{2}$ nodes. Therefore $\frac{C(C-1)}{2}$ one-vs-one
LS-SVMs are trained to generate the hyperplanes with maximum margin.
For each class the training vectors, $x_n$, are generated from the
eigenvectors spanning the signal subspace. The number of classes
depends on the antenna sectoring and the required resolution. For a
CDMA system the desired interference suppression dictates the fixed
beamwidth. CDMA offers this flexibility since all mobiles use the same
carrier frequency. For FDMA systems a narrow beamwidth is desired,
since frequency reuse determines the capacity of a cellular system.

The signal subspace eigenvectors of the received signal covariance
matrix are required for accurate DOA estimation. For a CDMA system
with adaptive antenna arrays the covariance matrix of the received
signal is

$$R_{rr} = E\left[ r_A r_A^H \right]. \tag{31}$$

In our machine learning based DOA estimation algorithm the principal
eigenvectors must be calculated. Eigen decomposition (ED) is the
standard computational approach for calculating the eigenvalues and
eigenvectors of the covariance matrix. ED is a computationally intense
technique; faster algorithms such as PASTd [17] have been developed
for real-time processing applications.

For the LS-SVM based approach to DOA estimation the output of the
receiver is used to calculate the sample covariance matrix of the
input data signal $r_A(k)$,

$$\hat{R}_{rr} = \frac{1}{M} \sum_{k=K-M+1}^{K} r_A(k)\, r_A^H(k). \tag{32}$$

The dimension of the observation matrix is $D \times M$, where $M$ is
the ideal sample size (window length), and the dimension of the sample
covariance matrix is $D \times D$. The principal eigenvectors,
$v_1, \ldots, v_D$, are calculated via eigen decomposition (ED) or
subspace tracking techniques. Each eigenvector is used to calculate a
covariance matrix, $R_{v_1 v_1}, \ldots, R_{v_D v_D}$.

TABLE  I

Projection  coefficients  for  machine  learning  based  power

The algorithm requires only the set of estimated eigenvectors from
the sample covariance matrix, which are used to generate projection
coefficients for the classification process. The projection vectors
are generated from the projection of $R_{v_d v_d}$, $1 \le d \le D$,
onto the primary eigenvector of the signal subspace. In the training
phase the hyperplanes at each DDAG node are constructed with these
projection vectors. In the testing phase $R_{v_d v_d}$ is generated
from the received signal $r_A(k)$ and the principal eigenvectors. Then
the projection coefficients for the $i$th/$j$th node of the DDAG are
computed with dot products of $R_{v_d v_d}$ and the $i$th/$j$th
training eigenvectors. This new set of projection vectors is tested
against the $i$th/$j$th hyperplane generated during the training
phase. The DOA labels are then assigned based on the DDAG evaluation
path. A similar projection coefficient technique has been successfully
applied to a multiclass SVM facial recognition problem presented in
[18]. Table I includes three sets of projection vectors; each set
corresponds to a different DOA. From a review of the data it is
evident that the classes are not linearly separable. The data must be
projected to a higher dimension feature space and tested against the
separating hyperplane.
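The covariance and projection steps can be sketched as follows. This is an interpretation rather than the authors' implementation: it assumes the $D \times N$ data are split into non-overlapping windows of $M$ samples, that one sample covariance matrix is formed per window as in (32), and that each projection vector is the product of a window covariance matrix with a given signal eigenvector. All function names are hypothetical.

```python
import numpy as np


def windowed_covariances(R, window):
    """Sample covariance matrices (D x D), one per non-overlapping window of M columns."""
    num_windows = R.shape[1] // window
    covs = []
    for w in range(num_windows):
        block = R[:, w * window:(w + 1) * window]
        covs.append(block @ block.conj().T / window)
    return covs


def principal_eigenvector(cov):
    """Unit-norm eigenvector of a Hermitian matrix with the largest eigenvalue."""
    _, vecs = np.linalg.eigh(cov)    # eigh returns eigenvalues in ascending order
    return vecs[:, -1]


def projection_vectors(R, window, signal_eigvec):
    """D x (N // M) array whose columns are covariance-times-eigenvector products."""
    covs = windowed_covariances(R, window)
    return np.column_stack([c @ signal_eigvec for c in covs])
```

Under these assumptions, 1500 samples with a window length of five yield 300 projection vectors per class.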

The following algorithm for the one-vs-one multiclass LS-SVM
implementation for DOA estimation includes preprocessing, training,
and testing steps. Specifically, the algorithm requires two sets of
projection vectors for each DDAG node. This allows for automatic MSE
calculations at each step of the DDAG evaluation path, thus providing
a unique method for error control and validation.

• Preprocessing  for  SVM  Training 

1)  Generate the $D \times N$ training signal vectors for the
$C$ LS-SVM classes, where $D$ is the number of antenna
elements and $N$ is the number of samples.

2)  Generate the $C$ sample covariance matrices, $U$,
with $M$ samples from the $D \times N$ data vector.

3)  Calculate the signal eigenvector, $S$, from each of
the $C$ sample covariance matrices.

4)  Calculate the $D \times 1$ projection vectors, $U \cdot S$, for
each of the $C$ classes. The ensemble of projection
vectors consists of $N/M$ samples.

5)  Store the projection vectors for the training phase
and the eigenvectors for the testing phase.

• LS-SVM  Training 

1)  With the $C$ projection vectors train the $\frac{C(C-1)}{2}$
nodes with the one-vs-one LS-SVM algorithm.

2)  Store the LS-SVM variables, $\alpha_i$ and $b$ from equation
(17), which define the hyperplane separation
for each DDAG node.

• Preprocessing  for  SVM  Testing 

1)  Acquire  D x N input  signal  from  antenna  array, 
this  signal  has  unknown  DOAs. 

2)  Generate  the  sample  covariance  matrix  with  M 
samples  from  the  D x N data  vector. 

3)  Calculate  the  eigenvectors  for  the  signal  subspace 
and  the  noise  subspace. 

4)  Generate the covariance matrices for each eigenvector.

• LS-SVM  Testing  for  the  i/j  DDAG  Node 

1)  Calculate two $D \times 1$ projection vectors with the
desired eigenvector covariance matrix and the $i$th
and $j$th eigenvectors from the training phase.

2)  Test both projection vectors against the LS-SVM
hyperplane for the $i$/$j$ node. This requires two
separate LS-SVM testing cycles, one with the
projection vector from the $i$th eigenvector and one
with the projection vector from the $j$th eigenvector.

3)  Calculate the mean value of the two LS-SVM
output vectors (labels). Select the mean value that
is closest to a decision boundary, 0 or 1. Compare
this value to the label definition at the node, then
select the proper label.

4)  Repeat  process  for  the  next  DDAG  node  in  the 
evaluation  path  or  declare  the  final  DOA  label. 

• Error  Control 

1)  Review the MSE calculations for the DDAG evaluation path.

2)  Apply error control and validation measures to
classify the label as either an accurate DOA estimate
or as NOISE.

VI.  Simulation  Results 

Two simulation plots are included below. Each simulation consists of
a four-class LS-SVM DDAG system. Figure 2 shows results for a ten
degree range per class. Figure 3 shows results for a one degree range
per class.

The antenna array includes eight elements; therefore the training and
test signals were $8 \times 1$ vectors. The training and test signals
are the complex outputs from the antenna array. The received complex
signal is modeled with a zero mean normal distribution with unit
variance; the additive noise is zero mean with a variance of 0.2.
This combination of signal and noise power translates into a 7 dB
SIR.
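The quoted ratio follows directly from the two variances. As a worked check, writing $\sigma_s^2$ for the signal variance and $\sigma_n^2$ for the noise variance (symbols introduced here only for this calculation):

```latex
\mathrm{SIR} = 10 \log_{10}\!\left( \frac{\sigma_s^2}{\sigma_n^2} \right)
             = 10 \log_{10}\!\left( \frac{1}{0.2} \right)
             = 10 \log_{10}(5) \approx 6.99\ \mathrm{dB}
```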

The  system  training  consists  of  six  DDAG  nodes  for  the 
four  DOA  classes.  Both  the  training  and  test  signals  consisted 
of  1500  samples  and  the  window  length  of  the  sample 
covariance  matrix  was  set  to  five.  Therefore  the  training 
and  test  sets  were  composed  of  300  samples  of  each  8x1 
projection  vector. 

To completely test the LS-SVM DDAG system's capabilities the
simulations were automated to test a wide range of DOAs. The DOA test
set consisted of signals ranging from three degrees before the first
DOA class to three degrees after the last DOA class. Thus there were
forty-six test signals for Figure 2 and fourteen test signals for
Figure 3. As can be seen from the two plots, the LS-SVM DDAG DOA
estimation algorithm logged no misclassifications in these tests.
Testing shows that the LS-SVM DDAG system accurately classifies the
DOAs for any desired number of classes and DOA separations from one
degree to twenty degrees.


Fig.  2.  LS-SVM  for  DOA  estimation,  four  classes  with  ten  degree 
separation  between  each. 


Fig.  3.  LS-SVM for DOA estimation, four classes with one degree
separation between each. [Plot axes: ML DOA Estimates; DOA Test
Signals]


A.  Decision Grids

The decision grid (DG) technique was developed to track the DDAG
evaluation path and generate statistics to characterize the
confidence level of the DOA classifications. The theoretical DG
(T-DG) is a technique we developed to quantify errors and add insight
into the robustness of the LS-SVM DDAG architecture. The T-DG is a
deterministic 2D grid for DDAGs with a relatively small number of
classes and a small DOA range between classes. The elements of the
T-DG represent the deterministic values of the two LS-SVM labels at
each DDAG level; the deterministic values are referred to as
"theoretical decision statistics". Designing T-DGs for DDAGs with
three to five classes and DOA ranges up to five degrees between
classes is straightforward. The T-DGs are not deterministic for large
DOA ranges; e.g., for a DOA range of ten degrees between classes,
empirical results show that the DDAG evaluation path is
unpredictable. The large DOA ranges lead to uncertainty in the
evaluation path, even though the test DOA is classified correctly.

Empirical decision grids (E-DG) are automatically generated in the
LS-SVM DDAG DOA estimation algorithm. The E-DGs tabulate the mean of
the LS-SVM output label vectors at each DDAG node and level; the mean
values are referred to as "decision statistics". The unique design of
this algorithm includes testing the input data against two
hyperplanes at the $i$th/$j$th node. With this approach the two
output vectors at each node are compared to one another. In a
noise-free environment with perfect classification, the two label
vectors would be binary opposites, i.e., one label vector would be
all 0's and the other label vector would be all 1's. This technique
enables computation of theoretical mean square errors and empirical
mean square errors; refer to Section VI-B.

Table II includes a standard T-DG, and Tables III and IV include
E-DGs for a three-class DDAG with a two degree DOA range per class.
The two levels of a three-class DDAG are equivalent to the first two
levels of a four-class DDAG; refer to Figure 1. Table II includes the
possible evaluation paths of this three-class DDAG. The nodes for
each DOA evaluation path are included for the first and second DDAG
level. For example, DOA 1 has an evaluation path of Node 1 vs 3 at
Level 1 and Node 1 vs 2 at Level 2. In Table III the E-DG presents
the decision statistics for a signal subspace eigenvector; in Table
IV the second E-DG presents the decision statistics for a noise
subspace eigenvector.

TABLE  II

Theoretical decision grid for a DDAG system with 3 classes and a 2
degree DOA range. [Columns: DOAs for Class 1, Class 2, Class 3]

TABLE  III

Empirical decision grid for a signal eigenvector.

TABLE  IV

Empirical decision grid for a noise eigenvector. [Columns: Class 1,
Class 2, Class 3]

B.  Theoretical  and  Empirical  MSEs 

The difficulty in tracking the performance of the LS-SVM DDAG DOA
estimation algorithm is due to the numerous DDAG evaluation paths.
For many DDAGs the evaluation paths can be determined based on the
input data and the class definitions. How can decision statistics be
applied to performance characterization?

The two primary performance measures for the LS-SVM DDAG are the
theoretical MSE (T-MSE) and the empirical MSE (E-MSE). Both MSE
performance measures are based on MSE calculations with T-DGs and
E-DGs. The T-MSE is an MSE calculation between the corresponding
elements of the T-DG and the E-DG. This is a measure of the
algorithm's empirical decision statistics in relation to the
"theoretical" decision statistics. For example, the T-MSE for a
3 class DDAG is calculated with the T-DG and E-DG presented in
Tables II and III. The T-MSE for Class 2 is calculated as


           Label 0                Label 1
Level 1    $(0.5 - 0.032)^2$      $(0.5 - 0.576)^2$
Level 2    $(1 - 1)^2$            $(0 - 0)^2$

Unlike the T-MSE, the E-MSE is a technique that allows for real-time
error tracking with only the empirical decision statistics. The E-MSE
uses only the E-DGs and the differences between the two LS-SVM
decision statistics at each node in the evaluation path. This is a
measure of the empirical classification accuracy achieved at each
DDAG node. The E-MSE for a 3 class DDAG is calculated with only the
E-DG presented in Table III. The MSE for Class 2, Level 1 is
$(|0.032 - 0.576| - 1)^2 = 0.208$ and the MSE for Class 2, Level 2 is
$(|1 - 0| - 1)^2 = 0$.
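The two error measures can be reproduced with a few lines of code. The sketch below uses only the Class 2 decision statistics quoted in the text; the function names are hypothetical, and how the per-element T-MSE terms are combined (summed or averaged) is left open because the text does not specify it.

```python
def t_mse_terms(theoretical, empirical):
    """Squared differences between matching T-DG and E-DG entries."""
    return [(t - e) ** 2 for t, e in zip(theoretical, empirical)]


def e_mse_node(stat_a, stat_b):
    """E-MSE at one node: (|difference of the two decision statistics| - 1)^2."""
    return (abs(stat_a - stat_b) - 1.0) ** 2


# Class 2, Level 1: theoretical statistics (0.5, 0.5), empirical (0.032, 0.576).
level1_terms = t_mse_terms([0.5, 0.5], [0.032, 0.576])
level1_e_mse = e_mse_node(0.032, 0.576)   # about 0.208
level2_e_mse = e_mse_node(1.0, 0.0)       # exactly 0
```

An E-MSE of zero corresponds to the ideal case in which the two label vectors are binary opposites.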

C.  Misclassifications  vs.  Gross  Errors 

Two secondary performance measures for the LS-SVM DDAG are
misclassifications and gross errors. These measures are used for
performance characterization of the multiclass LS-SVM DDAG DOA
estimation algorithm and for tracking variations in performance for
various algorithm parameters. Misclassifications and gross errors
cannot be used in a real-time implementation because knowledge of the
test DOAs is required.

Misclassifications measure "small shifts" in DOA
classifications. If a DOA is located near a border between
labels, the machine learning process could classify the data
to an adjacent label rather than the closest label. Therefore, a
misclassification is a shift-related error where a signal is
detected, but classified to a spatially adjacent label. This type
of error still gives an indication of the received DOA. The
region of misclassifications is defined as 1/2 of the DOA range
applied to both sides of a DOA class.

Gross errors measure significant errors in DOA
classifications. If a DOA is classified into a specific class, but
is spatially located at least one entire class away, then the
error is due to a breakdown in the machine learning process.
This type of error assigns false or misleading information to a
received DOA. The region of gross errors is defined as the
magnitude of the DOA range applied to both sides of the DOA class.

Figure 4 displays the DOA regions for correct
classifications, misclassifications, and gross errors. This specific
example is for a DDAG class centered at 0° with a 5°
DOA range, i.e., any DOA in the range [-2, 2] is correctly
classified to the 0° class. The region enclosed by the dashed
brackets includes all DOAs that are correctly classified at
the 0° class. If any DOA outside the dashed brackets but
inside the solid brackets is assigned to the 0° class, then that
DOA is a misclassification. If any DOA outside the
solid brackets is assigned to the 0° class, then that DOA
is a gross error. The misclassification region, for a
DOA classified at 0°, is DOA ∈ [-4, -3] ∪ [3, 4]. The gross
error region, for a DOA classified at 0°, is DOA ∉ [-4, 4].
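The three regions can be expressed as a small labeling function. This is a sketch under stated assumptions: the function name is hypothetical, the misclassification band is taken to extend half a DOA range beyond the class edge (consistent with the 0° example for integer-valued DOAs), and the treatment of DOAs that fall exactly on a boundary is a choice made here, not something the text specifies.

```python
def classify_error(assigned_center, doa_range, true_doa):
    """Label an assignment as 'correct', 'misclassification', or 'gross error'."""
    offset = abs(true_doa - assigned_center)
    half_range = doa_range / 2.0
    if offset <= half_range:
        return "correct"
    # Assumption: the misclassification band ends one full DOA range
    # away from the class center.
    if offset < doa_range:
        return "misclassification"
    return "gross error"


# DDAG class centered at 0 degrees with a 5 degree DOA range,
# evaluated for integer test DOAs from -6 to 6 degrees.
labels = {doa: classify_error(0.0, 5.0, doa) for doa in range(-6, 7)}
```

For this example the function marks -2 to 2 as correct, ±3 and ±4 as misclassifications, and everything from ±5 outward as gross errors, which agrees with the regions listed for Figure 4.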


Fig. 4. Diagram of regions defining DOA misclassifications and gross
errors.


D.  Kernel  Parameters 

Simulation results show that, of all tunable variables, kernel
selection has the greatest effect on the classification
process. The four kernels discussed in Section III-A
are tested with the LS-SVM DDAG DOA estimation
algorithm. The performance of each kernel function and
its associated parameters is characterized in terms of
MSE, misclassifications, and gross errors. In addition, the
LS-SVM regularization parameter, γ, is varied to show the
influence of the LS-SVM complexity.

1) Polynomial Kernel: The polynomial kernel provides
the best results in relation to the RBF, MLP, and linear
kernels. Figure 5 displays the T-MSE in terms of the
polynomial degree, d, and constant, θ. The simulation is based
on a four-class DDAG with a 5° DOA range and a fixed
LS-SVM variable, γ = 2. The results show that the degree
of the polynomial kernel affects the DOA estimation; the
best values are d = 2 and d = 4. For d = 1 the polynomial
kernel is equivalent to the linear kernel. The MSE is constant
for 1 < γ < 6, and the polynomial constant, θ, does not
influence the performance. The rate of misclassifications is
1.2% with zero gross errors. The degree of the polynomial is
the only factor affecting the computational time for system
training.
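The relation between the polynomial and linear kernels can be made concrete with a short sketch. The kernel form K(x, y) = (x · y + θ)^d is an assumption here, since the definitions of Section III-A are not reproduced in this section; the function names and the real-valued test vectors are likewise illustrative only.

```python
def linear_kernel(x, y):
    """Inner product of two real-valued vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree, theta):
    """Assumed polynomial kernel form: (x . y + theta) ** degree."""
    return (linear_kernel(x, y) + theta) ** degree


x = [1.0, 2.0]
y = [0.5, -1.0]

k_lin = linear_kernel(x, y)                      # 0.5 - 2.0 = -1.5
k_poly_d1 = polynomial_kernel(x, y, 1, 0.0)      # equals the linear kernel
k_poly_d2 = polynomial_kernel(x, y, 2, 1.0)      # (-1.5 + 1.0)^2 = 0.25
```

With d = 1 and θ = 0 the two kernels coincide exactly; for d = 1 and nonzero θ they differ only by a constant offset.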

2) Radial Basis Function Kernel: The performance of
the RBF kernel is characterized in terms of the LS-SVM
regularization variable, γ, and the smoothing parameter, σ².
The simulation is based on a four-class DDAG with a
5° DOA range. The results show that the MSE is constant for
γ > 1.5 and σ² > 0.5. The rate of misclassifications is 0.4%
with zero gross errors. The training time increases with the
value of γ and for small values of σ². The performance of
the RBF kernel matches the performance of the polynomial
kernel for DOAs in the range of 15° to 60°. The performance
of the polynomial kernel exceeds that of the RBF kernel for
DOAs < 15° and > 60°.


Fig. 5. Theoretical MSE for the polynomial kernel. The DOA range
is 5° and spans the DDAG classes at 30°, 35°, 40°, and 45°; the LS-SVM
parameter, γ, is set at 2.

3) Multilayer Perceptron Kernel: Results show that the
MLP kernel is ineffective in maintaining a low MSE for the
range of parameters tested. The rate of misclassifications is
42.5% and the rate of gross errors is 17.2%. Overall, the
performance of the MLP kernel is inferior to that of the
polynomial and RBF kernels.

4) Linear Kernel: The linear kernel is equivalent to the
polynomial kernel with d = 1. Large MSE values show
that the linear kernel is not effective in the LS-SVM DOA
estimation algorithm. The average T-MSE is 27.8% and the
average E-MSE is 61.1%.

E.  Training  and  Test  Vectors 

The design of training sequences is an important factor in
machine learning applications. For adaptive antenna arrays
the training sequences represent the array outputs for the
C DOA classes. Three specific elements of the training
sequences are noise variance, training vector length, and
length of the sample covariance window. The requirement
is to design training sequences that minimize both the
training error and the generalization error. Empirical analysis
of the multiclass LS-SVM based DOA estimation algorithm
shows that the training error is effectively zero; the data are
well defined and separable by a hyperplane in the feature
space. In this paper the generalization error is expressed
in terms of MSEs, misclassifications, and gross errors.

The primary method for training LS-SVM DDAG systems
for DOA estimation is based on synthetic training vectors
generated with known noise power and preselected vector
lengths. In practice, the training vectors would be stored in
the memory of the receiver that employs the DOA estimation
algorithm. This approach allows for offline training of the
binary LS-SVM algorithms.
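The three design elements named above (noise variance, vector length, and covariance window length) can be illustrated with a small generator of synthetic array data. This is a sketch under several assumptions that are not taken from the paper: a uniform linear array with half-wavelength spacing, a single unit-power source per DOA class, complex white Gaussian noise, and an eight-element array. All function names are hypothetical.

```python
import cmath
import math
import random


def steering_vector(doa_deg, num_elements):
    """Response of a half-wavelength-spaced uniform linear array (assumed geometry)."""
    phase = math.pi * math.sin(math.radians(doa_deg))
    return [cmath.exp(-1j * phase * k) for k in range(num_elements)]


def synthetic_snapshots(doa_deg, num_elements, length, noise_var, rng):
    """Array outputs for one source at doa_deg plus complex white Gaussian noise."""
    a = steering_vector(doa_deg, num_elements)
    sigma = math.sqrt(noise_var / 2.0)  # std. dev. per real/imaginary component
    snapshots = []
    for _ in range(length):
        # Unit-power symbol with a random phase (assumed signal model).
        s = cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
        snapshots.append([
            s * a_k + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for a_k in a
        ])
    return snapshots


def sample_covariance(snapshots):
    """R = (1/W) * sum over the window of x[n] x[n]^H."""
    m = len(snapshots[0])
    w = len(snapshots)
    return [[sum(x[i] * x[j].conjugate() for x in snapshots) / w
             for j in range(m)] for i in range(m)]


rng = random.Random(0)  # fixed seed so the training set can be regenerated offline
train = synthetic_snapshots(doa_deg=30.0, num_elements=8, length=50,
                            noise_var=0.04, rng=rng)
R = sample_covariance(train[:5])  # five-sample covariance window
```

Seeding the generator mirrors the offline workflow described above: the same training set can be reproduced and stored once, rather than regenerated at the receiver.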

Simulation results show that the LS-SVM DOA estimation
algorithm is robust, in terms of MSE, when analyzed for a
range of SIRs in the training vectors and the test signals. In
general, the noise power of the training vectors does not have a
dramatic effect on the generalization error. Simulations were
conducted with training vectors that included SIRs in the
range of 20 dB to 7 dB. Review of the misclassification
and gross error statistics shows that training vectors with noise
variances of 0.04 and 0.12, which correspond to SIRs of 13
dB and 10 dB, provide the best performance.

1) Length of Training and Testing Vectors: Figure 6
includes two plots of average theoretical MSE versus training
vector length. The data are specific to a four-class LS-SVM
DDAG system with a fourth-degree polynomial kernel. The two
plots show that the window length of the sample covariance
matrix does not impact the performance. Likewise, there is no
correlation between the length of the training vector and the
MSE. The results in Figure 6 are based on test vectors with
size equivalent to the training vectors. Figure 7 is a 3D plot
of the theoretical MSE as a function of vector dimensions,
i.e., the dimensions of the training vectors and input data vectors.
The length of the input data vector ranges from 0.5 to 2 times
the length of the training vectors. The data show that the length
of the input data vectors has no effect on the MSE statistics.


Fig.  6.  Average  theoretical  MSE  as  a function  of  training  vector  length. 
Two  data  plots  are  included;  one  plot  is  for  a sample  covariance  matrix  with 
a five  sample  window,  one  plot  is  for  a sample  covariance  matrix  with  a ten 
sample  window. 

Table V shows the processing times, in seconds, required
for training a four-class LS-SVM DDAG system with a
fourth-degree polynomial kernel and for testing the input data.
Data are included for training and test vectors
that range from 25 samples to 200 samples. The simulations
were conducted with a Pentium 4 running at 2.5 GHz. The
processing times are relative to the computer system and the
level of optimization applied to the programming, but serve
as a basic indicator for possible hardware implementation
and real-time applications.


Fig. 7. Theoretical MSE as a function of training vector length and input
vector length. The LS-SVM DDAG system includes four classes and a
fourth-degree polynomial kernel. The test window multiplier defines the input
vector length, i.e., the input vector length ranges from 0.5 to 2 times
the training vector length.

TABLE V

Processing times, in seconds, for one-vs-one multiclass
LS-SVM for DOA estimation.

                          Vector Size
         25     50     75     100    125    150     175     200
Train    0.30   0.94   2.25   4.49   7.39   11.27   15.23   20.38
Test     0.20   0.23   0.31   0.47   0.56   0.66    0.72    0.91
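A rough reading of how the processing times in Table V scale with vector size can be obtained from the first and last table entries. The sketch below is only a crude endpoint estimate on a log-log scale, not a fitted complexity model, and the function name is hypothetical.

```python
import math

# Values transcribed from Table V.
VECTOR_SIZES = [25, 50, 75, 100, 125, 150, 175, 200]
TRAIN_SECONDS = [0.30, 0.94, 2.25, 4.49, 7.39, 11.27, 15.23, 20.38]
TEST_SECONDS = [0.20, 0.23, 0.31, 0.47, 0.56, 0.66, 0.72, 0.91]


def growth_exponent(sizes, times):
    """Slope of log(time) versus log(size) between the first and last entries."""
    return math.log(times[-1] / times[0]) / math.log(sizes[-1] / sizes[0])


train_exponent = growth_exponent(VECTOR_SIZES, TRAIN_SECONDS)
test_exponent = growth_exponent(VECTOR_SIZES, TEST_SECONDS)
```

With the Table V values the training exponent is close to 2 and the testing exponent is roughly 0.7, so on this machine the training time grows approximately quadratically with vector size while the testing time grows more slowly than linearly.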

The data in this section show that the design of the
training vectors is important, but there is a tolerance in the
selection of noise power and training vector length. The
available tolerance in choosing parameters of the training
vectors validates the design of the LS-SVM DOA estimation
algorithm. This characteristic allows flexibility in the system
design and provides a high confidence level in the DOA
estimates. In addition, when considering real-time
implementation of the algorithm, the dimensions of the training
vector must be carefully reviewed. Shorter training vectors offer
high performance, in terms of MSE, and fast training times.

F. Range of DDAG Parameters for DOA Estimation

The performance of the LS-SVM DDAG DOA
estimation algorithm has been demonstrated in the previous
sections. Most of the previous simulation results were based on
three and four class DDAGs. To cover the desired span of
the antenna array sector, the algorithm must be flexible in the
number of DDAG classes and DOA ranges. Different
applications require different DDAG architectures. In many cases
the application will require fast training and high accuracy.
Training an LS-SVM DDAG system can be performed offline,
but covering a large antenna sector with high resolution
would require either:
TABLE VI

Percentage of misclassifications versus DDAG classes (3-6)
and DOA ranges (1-10).



                  DOA Range between Classes, Degrees
Classes   1     2     3     4     5     6     7     8     9     10
3         0     0     0     0     6.7   0     4.8   4.2   0     0
4         0     0     0     0     0     0     3.6   3.1   0     0
5         0     0     0     0     4.0   0     2.9   0     6.7   0
6         0     0     0     0     0     0     4.8   0     5.6   0

1)  A DDAG  with  a large  number  of  classes  and  a small 
DOA  range, 

2) A two-stage system where the antenna sector is
partitioned into a set number of classes with a wide DOA
range. First, the signal is detected in a specific partition;
then a DDAG structure for high resolution can classify
the DOA with high accuracy.

Whatever the desired approach is, the LS-SVM DDAG
algorithm must be flexible in design and robust in performance.

The data in this section demonstrate the performance for a wide
range of DDAG structures. Simulations were conducted for
three to ten classes with DOA ranges between 1° and 20°.
With these classes and DOA ranges the LS-SVM DDAG
algorithm is able to span antenna sectors of 3° to 90°.
Table VI lists the percentage of misclassifications. Seventy-five
percent of the DDAG structures with DOA ranges between
1° and 10° have zero misclassifications; the average rate of
misclassifications for the set of DDAG structures is 1.2%.
The largest percentage of misclassifications is 6.7% and
occurs with a five-class DDAG with a nine-degree DOA
range.
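The summary figures quoted for Table VI can be checked directly from the table entries. The sketch below uses the values as they appear in the table (rows are the number of DDAG classes, columns are DOA ranges of 1° to 10°); the variable names are illustrative.

```python
# Percentage of misclassifications, indexed by number of DDAG classes.
# Each row lists DOA ranges of 1 through 10 degrees, as in Table VI.
MISCLASSIFICATION_PCT = {
    3: [0, 0, 0, 0, 6.7, 0, 4.8, 4.2, 0, 0],
    4: [0, 0, 0, 0, 0, 0, 3.6, 3.1, 0, 0],
    5: [0, 0, 0, 0, 4.0, 0, 2.9, 0, 6.7, 0],
    6: [0, 0, 0, 0, 0, 0, 4.8, 0, 5.6, 0],
}

# Flatten to (classes, doa_range_deg, percentage) triples.
cells = [(classes, doa_range, pct)
         for classes, row in MISCLASSIFICATION_PCT.items()
         for doa_range, pct in enumerate(row, start=1)]

zero_fraction = sum(1 for _, _, pct in cells if pct == 0) / len(cells)
average_pct = sum(pct for _, _, pct in cells) / len(cells)
worst_pct = max(pct for _, _, pct in cells)
worst_cases = [(c, r) for c, r, pct in cells if pct == worst_pct]
```

The table entries give 30 of 40 structures (75%) with zero misclassifications, an average of about 1.2%, and a maximum of 6.7%. As transcribed, the table shows the 6.7% value for the five-class, 9° structure and also for the three-class, 5° structure.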

G.  Multilabel  Capability  for  Multiple  DOAs 

In DOA estimation for cellular systems, there can be
multiple DOAs for a given signal. This results from multipath
effects induced by the communication channel. The machine
learning system must be able to discriminate between a small
number of independent DOAs that include signal components
with similar time delays. With this constraint the machine
learning algorithm must be a multiclass system and able
to process multiple labels.

The machine learning algorithm must generate multiclass
labels, y_i ∈ C, where C ⊂ [-90°, 90°] is a set of real numbers
that represent an appropriate range of expected DOA values,
and multiple labels y_i, i = 1, ..., L, for L dominant signal
paths. If antenna sectoring is used in the cellular system, the
multiclass labels are from the set C ⊂ [S_i], where S_i is the
field of view for the i-th sector.

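Multilabel classification is possible with the LS-SVM
DDAG algorithm presented in Section V. The machine
learning algorithm for DOA estimation assigns DOA labels
to each eigenvector in the signal subspace. By repeating the
DDAG cycle for each eigenvector, the multiclass algorithm
has the capability of assigning multiple labels to the input
signal.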
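The repeated DDAG cycle can be sketched as follows. This is an illustration of the standard list-based DDAG evaluation, in which each node tests the first against the last remaining class and eliminates the loser; whether the structure of Section V follows exactly this ordering is an assumption. The binary decision function below is a hypothetical stand-in for a trained binary LS-SVM, and each "eigenvector" is reduced to a single scalar feature purely to keep the example short.

```python
def ddag_classify(x, class_labels, binary_decide):
    """Walk a DDAG: each node compares the first and last remaining classes
    and removes the loser, so C classes need C - 1 node evaluations."""
    remaining = list(class_labels)
    while len(remaining) > 1:
        first, last = remaining[0], remaining[-1]
        winner = binary_decide(x, first, last)
        if winner == first:
            remaining.pop()      # 'last' is eliminated
        else:
            remaining.pop(0)     # 'first' is eliminated
    return remaining[0]


def multilabel_doa(eigenvectors, class_labels, binary_decide):
    """Repeat the DDAG cycle once per signal-subspace eigenvector."""
    return [ddag_classify(v, class_labels, binary_decide) for v in eigenvectors]


def nearest_center(x, class_a, class_b):
    """Hypothetical placeholder for a binary LS-SVM node: nearer class center wins."""
    return class_a if abs(x - class_a) <= abs(x - class_b) else class_b


# Four DOA classes at 30, 35, 40, and 45 degrees; two dominant signal paths.
labels = multilabel_doa([31.2, 44.0], [30, 35, 40, 45], nearest_center)
```

In this toy case the two inputs receive the labels 30 and 45, one label per eigenvector, which is the multilabel behavior described above.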


VII.  Comparison  to  Standard  DOA  Estimation 
Algorithms 

The performance of the one-vs-one multiclass LS-SVM
algorithm for DOA estimation is described in detail in
the previous section. The results show that the multiclass
classification approach to DOA estimation provides unique
benefits in terms of computational complexity and flexibility.
Each algorithm is trained for C DOA classes. The number
of classes depends on the antenna sectoring and the
required resolution. The ideal application of this technique
is CDMA cellular systems. For a CDMA system the desired
interference suppression dictates the fixed beamwidth. A
reduction in beamwidth corresponds to a reduction in MAI,
thus reducing the required transmit power at the mobile
subscriber. CDMA offers this flexibility since all mobiles
use the same carrier frequency. For Frequency Division
Multiple Access (FDMA) systems a narrow beamwidth is
desired, since frequency reuse factors into the capacity of a
cellular system, thus requiring accurate DOA estimates with
high resolution.


A.  Computational  Complexity 

Conventional  subspace  based  DOA  estimation  algorithms, 
such  as  MUSIC  and  ESPRIT,  are  computationally  complex. 
The  algorithms  require  accurate  knowledge  of  the  signal 
subspace  dimension  and  accurate  estimates  of  the  signal 
and  noise  subspace  eigenvectors.  Additionally,  the  MUSIC 
algorithm  requires  a precise  characterization  of  the  antenna 
array  and  the  ESPRIT  algorithm  requires  multiple  eigen 
decompositions. 

The one-vs-one multiclass LS-SVM algorithm for DOA
estimation is flexible with respect to computational
requirements. The training cycle for the LS-SVM based DDAG
is straightforward and can be completed offline with
simulated data. The only information required is the size of
the antenna array and the number of DDAG nodes, which
corresponds to the DOA classes. For accurate DOA estimates the
only information required for the LS-SVM DDAG testing
cycle is the dimension of the antenna array and accurate
eigenvector estimates of the sample covariance matrix. The
dimension of the signal subspace is not required, nor is an
accurate characterization of the antenna array.


B.  Simulation  Results 

Figure 8 compares the one-vs-one multiclass LS-SVM
DOA estimation algorithm and the MUSIC algorithm. The
top window shows perfect DOA estimation for the machine
learning method presented in this paper. The multiclass
algorithm includes an eight-class DDAG and a one-degree
DOA range per class. Note that the multiclass LS-SVM
algorithm classifies signals outside the DOA classes to the
nearest class, as shown with the DOAs at 12° to 14° and
23° to 25°. The bottom window displays the DOA estimation
with the MUSIC algorithm; 100 DOA estimates are averaged
for each received signal and the amplitudes are normalized
to the largest estimate. The plots show that the resolution
capabilities of the one-vs-one multiclass LS-SVM DOA estimation
algorithm equal those of the MUSIC algorithm. One drawback
of the MUSIC algorithm is the broad width of the DOA
estimate; a level detection step is required to accurately select
the maximum response.

Figure 9 compares the errors and DOA estimates of each
algorithm. For this simulation the one-vs-one multiclass
LS-SVM algorithm includes a seventeen-class DDAG and a
five-degree DOA range per class. The top window plots
the errors in the DOA estimates for a ninety-degree antenna
sector and one DOA sample per degree. The definitions of
an error are specific to the two algorithms. For the machine
learning based algorithm, an error is defined as a DOA that is
classified into a wrong DOA class. For the MUSIC algorithm,
an error is the difference between the estimated DOA and
the actual DOA. As shown in the top window, the only
errors associated with the LS-SVM based algorithm occur
for DOAs greater than 82°. The DOAs in error are classified
into the spatially adjacent DOA class at 80°. Likewise, the
errors associated with the MUSIC algorithm that are greater
than 1° occur for DOAs greater than 70°. The plots in Figure
9 demonstrate the robust performance of the one-vs-one multiclass
LS-SVM algorithm for DOA estimation.

[Figure panels: "DOAs Calculated with LS-SVM" (legend: ML DOA Estimates, DOA Test Signals) and "DOAs calculated with MUSIC and ED"; horizontal axes: DOAs, 5 to 30.]


Fig. 9. Comparison of errors and estimated DOAs for the LS-SVM
based DOA estimation algorithm and the MUSIC algorithm. The one-vs-one
multiclass LS-SVM DOA estimation algorithm includes seventeen classes
and a five-degree DOA range.


C.  Benefits  over  Standard  Techniques 
Evaluation of the performance statistics in Section VI
shows that the one-vs-one multiclass LS-SVM algorithm for
DOA estimation is reliable with a high degree of accuracy. In
terms of performance our new algorithm provides the same
capabilities as the standard DOA estimation methods.
Specifically, accurate DOA estimates, to a one-degree resolution,
can be achieved with the standard subspace based algorithms
and our machine learning based algorithm. The primary
benefits of our LS-SVM based DOA estimation algorithm are
the reduced computational complexity, described above, and
the flexibility in terms of DOA classes versus requirements.
The specific application dictates the desired resolution and
therefore the number of DOA classes. For example, one
application may include a sixty-degree antenna sector and a
desired resolution of ten degrees. These requirements would
translate into a seven-class system. Another application may
include a twenty-degree sector and a desired resolution of two
degrees; this would translate into an eleven-class system. An
additional option is to place two DDAG systems in series, as
described in Section VI-F, which allows for a high resolution
with a small number of classes. In general, the one-vs-one
multiclass LS-SVM algorithm for DOA estimation can be
adapted to specific requirements, as influenced by system
capacity, channel conditions, and available computational
resources. The MUSIC and ESPRIT algorithms offer no
flexibility in terms of DOA resolution and computational
resources.
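The mapping from sector width and resolution to the number of DOA classes in the two examples above is consistent with placing a class center at both edges of the sector. The sketch below encodes that reading as an assumption; the function names are hypothetical. The pair count uses the standard fact that a one-vs-one scheme trains one binary classifier per pair of classes.

```python
def num_doa_classes(sector_deg, resolution_deg):
    """Classes needed when class centers sit at both sector edges (assumed layout)."""
    return int(round(sector_deg / resolution_deg)) + 1


def num_binary_classifiers(num_classes):
    """One-vs-one training uses one binary LS-SVM per pair of classes."""
    return num_classes * (num_classes - 1) // 2


classes_a = num_doa_classes(60, 10)            # sixty-degree sector, 10 degree resolution
classes_b = num_doa_classes(20, 2)             # twenty-degree sector, 2 degree resolution
pairs_a = num_binary_classifiers(classes_a)
pairs_b = num_binary_classifiers(classes_b)
```

These give seven and eleven classes, matching the examples in the text, and 21 and 55 binary classifiers to train, respectively. The pairwise count grows quadratically with the number of classes, which is one motivation for the two-stage arrangement with fewer classes per stage.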


Fig. 8. Comparison between the LS-SVM based DOA estimation algorithm
and the MUSIC algorithm. The one-vs-one multiclass LS-SVM DOA
estimation algorithm includes eight classes and a one-degree DOA range.


VIII.  Conclusion 

In this paper we presented a machine learning architecture
for DOA estimation as applied to a CDMA cellular system.
The broad range of our research in machine learning based
DOA estimation includes multiclass and multilabel
classification, classification accuracy, error control and validation,
kernel selection, estimation of signal subspace dimension,
and overall performance characterization. We presented an
overview of a multiclass SVM learning method and a
successful implementation of a one-vs-one multiclass LS-SVM
DDAG system for DOA estimation.

The LS-SVM DOA estimation algorithm is superior to
standard techniques due to a robust design that is
insensitive to received SIR, Doppler shift, and size of the antenna
array, and because its computational requirements are adaptable to
the desired applications. The algorithm was designed with
a multiclass, multilabel capability and includes an error
control and validation process. In addition, there are many
limitations of the standard DOA estimation algorithms, ESPRIT
and MUSIC, that do not exist with the LS-SVM DOA
estimation algorithm.

The LS-SVM algorithm for DOA estimation assigns DOA
labels to each eigenvector in the signal subspace. By
repeating the DDAG cycle for each eigenvector, the multiclass
algorithm has the capability of assigning multiple labels to
the input signal. Simulation results show a high degree of
accuracy and demonstrate that the LS-SVM DDAG system has a
wide range of performance capabilities. The results show
that the algorithm is accurate for a wide range of DDAG
structures, independent of the number of DDAG classes or the
DOA range per class. The LS-SVM DDAG system accurately
classifies the DOAs for three to ten classes and DOA ranges
from one degree to twenty degrees.